
54 research outputs found

A standard based approach for biomedical knowledge representation

Author: A. Farkash
A. Shabo
C. Conti
D.M. Cusi
E. Salvi
F. Rizzi
H. Neuvirth
S. Bianchi
Y. Goldschmidt
Publication venue: IOS Press
Publication date: 01/01/2011

The new generation of health information standards, in which the syntax and semantics of the content are explicitly formalized, allows for interoperability in healthcare scenarios and analysis in clinical research settings. Studies involving clinical and genomic data include accumulating knowledge as relationships between genotypic and phenotypic information as well as associations within the genomic and clinical worlds. Some involve analysis results targeted at a specific disease; others are of a predictive nature specific to a patient and may be used by decision support applications. Representing knowledge is as important as representing data, since data is more useful when coupled with relevant knowledge. Any further analysis and cross-research collaboration would benefit from persisting knowledge and data in a unified way. This paper describes a methodology used in Hypergenes, an EC FP7 project targeting Essential Hypertension, which captures data and knowledge using standards such as HL7 CDA and Clinical Genomics, aligned with the CEN EHR 13606 specification. We demonstrate the benefits of such an approach for clinical research as well as in healthcare-oriented scenarios.

AIR Universita degli studi di Milano

Combination of scoring schemes for protein docking

Author: B Huang
C Zhang
CM Deane
D Kozakov
Dietmar Schomburg
F Melo
G Moont
H Neuvirth
I Halperin
IN Shindyalov
J Mintseris
JE Dennis
KE Gottschalk
L Lo Conte
M Meyer
O Martin
O Zimmermann
P Aloy
P Caffrey
P Chakrabarti
P Heuser
Philipp Heuser
R Development Core Team
RB Schnabel Koontz J.
RM Jackson
S Jones
V Grimm
WS Valdar
Publication venue: BioMed Central
Publication date: 01/08/2007

BACKGROUND: Docking algorithms are developed to predict in which orientation two proteins are likely to bind under natural conditions. The currently used methods usually consist of a sampling step followed by a scoring step. We developed a weighted geometric correlation based on optimised atom-specific weighting factors and combined it with our previously published amino-acid-specific scoring and with a comprehensive SVM-based scoring function. RESULTS: The scoring with the atom-specific weighting factors yields better results than the amino-acid-specific scoring. In combination with SVM-based scoring functions, the percentage of complexes for which a near-native structure can be predicted within the top 100 ranks increased from 14% with the geometric scoring to 54% with the combination of all scoring functions. Especially for the enzyme-inhibitor complexes the results of the ranking are excellent. For half of these complexes a near-native structure can be predicted within the first 10 proposed structures, and for more than 86% of all enzyme-inhibitor complexes within the first 50 predicted structures. CONCLUSION: We were able to develop a combination of different scoring schemes which considers a series of previously described and some new scoring criteria, yielding a remarkable improvement of prediction quality.
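The "within the top N ranks" figures quoted in this abstract are success rates over a set of complexes. A minimal sketch of how such a rate could be computed is given here; the function name, the input layout (one best near-native rank per complex) and the example ranks are illustrative assumptions and are not taken from the paper.

```python
def top_n_success_rate(best_ranks, n):
    """Fraction of complexes whose best near-native pose is ranked within the top n.

    best_ranks: one entry per complex, holding the 1-based rank of the best
    near-native pose, or None if no near-native pose was sampled at all.
    """
    if not best_ranks:
        raise ValueError("need at least one complex")
    hits = sum(1 for rank in best_ranks if rank is not None and rank <= n)
    return hits / len(best_ranks)


# Made-up ranks for five hypothetical complexes (not data from the paper).
example_ranks = [3, 47, 120, None, 9]
rate_top10 = top_n_success_rate(example_ranks, 10)
rate_top100 = top_n_success_rate(example_ranks, 100)
```

Complexes with no sampled near-native pose stay in the denominator, so a weak sampling step lowers the rate just as a weak scoring step does.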

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ElliPro: a new structure-based tool for the prediction of antibody epitopes

Author: Alessandro Sette
B Peters
Bjoern Peters
D Schneidman-Duhovny
E Westhof
H Neuvirth
HM Berman
Huynh-Hoa Bui
J Novotny
JA Greenbaum
JG Mandell
JM Thornton
Julia Ponomarenko
JV Ponomarenko
MHV Van Regenmortel
MJ Gomara
MS Bijker
Nicholas Fusseder
P Haste Andersen
Philip E Bourne
SF Altschul
T Fawcett
U Kulkarni-Kale
WD Bradford Jr
Wei Li
WG Laver
WR Taylor
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: Reliable prediction of antibody, or B-cell, epitopes remains challenging yet highly desirable for the design of vaccines and immunodiagnostics. A correlation between antigenicity, solvent accessibility, and flexibility in proteins was demonstrated. Subsequently, Thornton and colleagues proposed a method for identifying continuous epitopes in the protein regions protruding from the protein's globular surface. The aim of this work was to implement that method as a web-tool and evaluate its performance on discontinuous epitopes known from the structures of antibody-protein complexes. RESULTS: Here we present ElliPro, a web-tool that implements Thornton's method and, together with a residue clustering algorithm, the MODELLER program and the Jmol viewer, allows the prediction and visualization of antibody epitopes in a given protein sequence or structure. ElliPro has been tested on a benchmark dataset of discontinuous epitopes inferred from 3D structures of antibody-protein complexes. In comparison with six other structure-based methods that can be used for epitope prediction, ElliPro performed the best and gave an AUC value of 0.732, when the most significant prediction was considered for each protein. Since the rank of the best prediction was at most in the top three for more than 70% of proteins and never exceeded five, ElliPro is considered a useful research tool for identifying antibody epitopes in protein antigens. ElliPro is available at http://tools.immuneepitope.org/tools/ElliPro. CONCLUSION: The results from ElliPro suggest that further research on antibody epitopes considering more features that discriminate epitopes from non-epitopes may further improve predictions. As ElliPro is based on the geometrical properties of protein structure and does not require training, it might be more generally applied for predicting different types of protein-protein interactions.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

eScholarship - University of California

Predicting protein-protein binding sites in membrane proteins

Author: A Elofsson
A Koike
A Liaw
AJ Bordner
Andrew J Bordner
B Wang
C Yan
D Lupo
E Krissinel
GE Tusnady
H Chen
H Neuvirth
HX Zhou
I Res
JR Bradford
L Breiman
L Feng
MA Yildirim
NJ Burgoyne
P Fariselli
R Development Core Team
R Landgraf
RC Edgar
S Hartel-Schenk
S Jones
SA Eyers
SF Altschul
SH White
TM Bakheet
W Li
XW Chen
Y Ofran
Publication venue: BioMed Central
Publication date: 01/09/2009

BACKGROUND: Many integral membrane proteins, like their non-membrane counterparts, form either transient or permanent multi-subunit complexes in order to carry out their biochemical function. Computational methods that provide structural details of these interactions are needed since, despite their importance, relatively few structures of membrane protein complexes are available. RESULTS: We present a method for predicting which residues are in protein-protein binding sites within the transmembrane regions of membrane proteins. The method uses a Random Forest classifier trained on residue type distributions and evolutionary conservation for individual surface residues, followed by spatial averaging of the residue scores. The prediction accuracy achieved for membrane proteins is comparable to that for non-membrane proteins. Also, like previous results for non-membrane proteins, the accuracy is significantly higher for residues distant from the binding site boundary. Furthermore, a predictor trained on non-membrane proteins was found to yield poor accuracy on membrane proteins, as expected from the different distribution of surface residue types between the two classes of proteins. Thus, although the same procedure can be used to predict binding sites in membrane and non-membrane proteins, separate predictors trained on each class of proteins are required. Finally, the contribution of each residue property to the overall prediction accuracy is analyzed and prediction examples are discussed. CONCLUSION: Given a membrane protein structure and a multiple alignment of related sequences, the presented method gives a prioritized list of which surface residues participate in intramembrane protein-protein interactions. The method has potential applications in guiding the experimental verification of membrane protein interactions, structure-based drug discovery, and also in constraining the search space for computational methods, such as protein docking or threading, that predict membrane protein complex structures.
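The abstract mentions spatial averaging of per-residue scores without giving details. One simple way such a smoothing step could look is sketched here, under stated assumptions: each residue is represented by a single 3D coordinate, every residue within a fixed radius (including the residue itself) contributes equally, and the 10.0 cutoff is an arbitrary illustrative value. The paper's actual averaging scheme may differ.

```python
import math


def spatially_average(coords, scores, radius=10.0):
    """Replace each residue score by the mean score of all residues within radius.

    coords: list of (x, y, z) tuples, one per surface residue (assumed layout).
    scores: per-residue classifier scores, in the same order as coords.
    radius: neighbourhood cutoff; 10.0 is an illustrative assumption.
    """
    if len(coords) != len(scores):
        raise ValueError("coords and scores must have the same length")
    smoothed = []
    for centre in coords:
        # The residue itself is at distance 0, so the neighbourhood is never empty.
        nearby = [s for c, s in zip(coords, scores) if math.dist(centre, c) <= radius]
        smoothed.append(sum(nearby) / len(nearby))
    return smoothed


# Two adjacent residues and one isolated residue (made-up coordinates and scores).
example_coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
example_scores = [1.0, 0.0, 0.5]
smoothed_scores = spatially_average(example_coords, example_scores, radius=5.0)
```

Smoothing of this kind damps isolated high or low scores, which is consistent with binding sites being contiguous surface patches rather than scattered single residues.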

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Homology Inference of Protein-Protein Interactions via Conserved Binding Sites

Author: A Marchler-Bauer
A Shulman-Peleg
AJ Walhout
Anna R. Panchenko
B Burgess
BA Shoemaker
BG Ma
BH Dessailly
D Kemmer
Dachuan Zhang
E Krissinel
ED Levy
ER Jefferson
H Chen
H Neuvirth
H Yu
H Zhu
HM Berman
I Ispolatov
J Chen
J Kim
J Kirn
JE Dayhoff
JF Gibrat
K Hashimoto
K Henrick
L Xue
LR Matthews
M Gribskov
M Persico
Manoj Tyagi
MP Stumpf
N Slonim
P Aloy
P Fariselli
Q Xu
QC Zhang
Ratna R. Thangudu
RH Holm
RR Thangudu
S Henikoff
S Liang
S Mika
S Mintz
SF Altschul
Stephen H. Bryant
T Reguly
Thomas Madej
Vladimir N. Uversky
WE Newton
Publication venue: Public Library of Science
Publication date: 01/01/2012

The coverage and reliability of protein-protein interactions determined by high-throughput experiments still need to be improved, especially for higher organisms. The question therefore persists of how interactions can be verified and predicted by computational approaches using available data on protein structural complexes. Recently we developed an approach called IBIS (Inferred Biomolecular Interaction Server) to predict and annotate protein-protein binding sites and interaction partners, based on the assumption that the structural location and sequence patterns of protein-protein binding sites are conserved between close homologs. In this study, we first confirmed the high accuracy of our method and found that its accuracy depends critically on using all available data on structures of homologous complexes, compared with approaches in which only a non-redundant set of complexes is employed. Second, we showed that there is a trade-off between specificity and sensitivity if the prediction employs only evolutionarily conserved binding site clusters or clusters supported by only one observation (singletons). Finally, we addressed the question of identifying biologically relevant interactions using the homology inference approach and demonstrated that a large majority of crystal packing interactions can be correctly identified and filtered by our algorithm. At the same time, about half of the biological interfaces that are not present in the protein crystallographic asymmetric unit can be reconstructed by IBIS from homologous complexes without prior knowledge of the crystal parameters of the query protein.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Identification of hot-spot residues in protein-protein interactions by computational docking

Author: AA Bogan
B Ma
H Neuvirth
HA Gabb
IS Moreira
J Fernández-Recio
Juan Fernández-Recio
KS Thorn
L Li
L Lo Conte
M Almlöf
MG Mateu
MR Arkin
O Keskin
PL Toogood
R Chen
R Guerois
RL Shields
S Jones
S Sogabe
S Zhong
SJ Darnell
Solène Grosdidier
T Kortemme
T Man-Kuang Cheng
U Pieper
WL DeLano
Y Ofran
Z Hu
Publication venue: BioMed Central
Publication date: 01/01/2008

BACKGROUND: The study of protein-protein interactions is becoming increasingly important for biotechnological and therapeutic reasons. We can define two major areas therein: the structural prediction of protein-protein binding mode, and the identification of the relevant residues for the interaction (so-called 'hot-spots'). These hot-spot residues are of high interest since they are considered one of the possible ways of disrupting a protein-protein interaction. Unfortunately, large-scale experimental measurement of residue contribution to the binding energy, based on alanine-scanning experiments, is costly and thus data is fairly limited. Recent computational approaches for hot-spot prediction have been reported, but they usually require the structure of the complex. RESULTS: We have applied here normalized interface propensity (NIP) values derived from rigid-body docking with electrostatics and desolvation scoring for the prediction of interaction hot-spots. This parameter identifies hot-spot residues on interacting proteins with predictive rates that are comparable to other existing methods (up to 80% positive predictive value), with the advantage of not requiring any prior structural knowledge of the complex. CONCLUSION: The NIP values derived from rigid-body docking can reliably identify a number of hot-spot residues whose contribution to the interaction arises from electrostatics and desolvation effects. Our method can propose residues to guide experiments in complexes of biological or therapeutic interest, even in cases with no available 3D structure of the complex.
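The positive predictive value quoted in this abstract is the share of predicted hot-spot residues that are confirmed experimentally, i.e. TP / (TP + FP). A minimal sketch of that calculation follows; the residue labels are invented for illustration and the function name is an assumption, not part of the authors' software.

```python
def positive_predictive_value(predicted, confirmed):
    """PPV = true positives / all positive predictions.

    predicted: residue identifiers predicted to be hot-spots.
    confirmed: residue identifiers confirmed as hot-spots by experiment.
    """
    predicted = set(predicted)
    confirmed = set(confirmed)
    if not predicted:
        raise ValueError("PPV is undefined when nothing is predicted")
    true_positives = len(predicted & confirmed)
    return true_positives / len(predicted)


# Invented chain:residue labels, used only to exercise the function.
predicted_hot_spots = {"A:45", "A:48", "B:12", "B:30", "B:77"}
confirmed_hot_spots = {"A:45", "B:12", "B:30", "B:77", "A:99"}
ppv = positive_predictive_value(predicted_hot_spots, confirmed_hot_spots)
```

PPV says nothing about hot-spots that were missed ("A:99" in this example), so it is usually read alongside a coverage or sensitivity figure.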

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Repositorio de Universidad de La Rioja

Predicting protein-protein interface residues using local surface structural similarity

Author: A Porollo
A Rossi
B Liu
B Ma
BA Shoemaker
C Yan
Drena Dobbs
F Wu
H Chen
H Hwang
H Naveed
H Neuvirth
HM Berman
HX Zhou
I Ezkurdia
J Fernández-Recio
J Janin
J Konc
J Yu
JE Dayhoff
JL Chung
JR Bradford
K Henrick
L Bartoli
L Giot
M Guharoy
M Sikić
N Carl
N Tuncbag
NJ Krogan
P Baldi
P Fariselli
QC Zhang
R Liu
R Nussinov
RA Jordan
Rafael A Jordan
S Hubbard
S Jones
S Li
S Liang
S Qin
SJ de Vries
Vasant Honavar
X Li
Y Murakami
Y Ofran
Yasser EL-Manzalawy
Publication venue: BioMed Central
Publication date: 01/01/2012

BACKGROUND: Identification of the residues in protein-protein interaction sites has a significant impact in problems such as drug discovery. Motivated by the observation that the set of interface residues of a protein tends to be conserved even among remote structural homologs, we introduce PrISE, a family of local structural similarity-based computational methods for predicting protein-protein interface residues. RESULTS: We present a novel representation of the surface residues of a protein in the form of structural elements. Each structural element consists of a central residue and its surface neighbors. The PrISE family of interface prediction methods uses a representation of structural elements that captures the atomic composition and accessible surface area of the residues that make up each structural element. Each of the PrISE methods identifies, for each structural element in the query protein, a collection of similar structural elements in its repository of structural elements and weights them according to their similarity with the structural element of the query protein. PrISEL relies on the similarity between structural elements (i.e. local structural similarity). PrISEG relies on the similarity between protein surfaces (i.e. general structural similarity). PrISEC combines local structural similarity and general structural similarity to predict interface residues. These predictors label the central residue of a structural element in a query protein as an interface residue if a weighted majority of the structural elements that are similar to it are interface residues, and as a non-interface residue otherwise. The results of our experiments using three representative benchmark datasets show that PrISEC outperforms PrISEL and PrISEG, and that PrISEC is highly competitive with state-of-the-art structure-based methods for predicting protein-protein interface residues. Our comparison of PrISEC with PredUs, a recently developed method for predicting interface residues of a query protein based on the known interface residues of its (global) structural homologs, shows that performance superior or comparable to that of PredUs can be obtained using only local surface structural similarity. PrISEC is available as a Web server at http://prise.cs.iastate.edu/. CONCLUSIONS: Local surface structural similarity based methods offer a simple, efficient, and effective approach to predict protein-protein interface residues.
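The labelling rule described in this abstract (interface if a weighted majority of the similar structural elements are interface residues) can be illustrated with a short sketch. The input layout, a list of (similarity weight, is_interface) pairs for the retrieved elements, is an assumption made for illustration; how PrISE computes the similarity weights is not shown in the abstract and is not reproduced here.

```python
def weighted_majority_label(similar_elements):
    """Return True (interface) if interface elements hold more than half the total weight.

    similar_elements: list of (weight, is_interface) pairs, one per similar
    structural element retrieved from the repository (assumed layout).
    """
    interface_weight = sum(w for w, is_interface in similar_elements if is_interface)
    total_weight = sum(w for w, _ in similar_elements)
    # Ties, and the case of no retrieved elements, fall back to non-interface.
    return interface_weight > total_weight / 2


# One highly similar interface element outweighs two weakly similar non-interface ones.
label = weighted_majority_label([(0.9, True), (0.3, False), (0.2, False)])
```

With equal weights this reduces to a plain majority vote; the weighting lets a few close matches override many distant ones.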

Digital Repository @ Iowa State University (ISU)

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Comparison of Classifier Fusion Methods for Predicting Response to Anti HIV-1 Therapy

Author: A Altmann
Anders Sönnerborg
André Altmann
B Larder
Daniel Struck
Derya Unutmaz
Ehud Aharoni
Eugen Schülter
Francesca Incardona
G Rogova
H Akaike
Hani Neuvirth
J Kittler
Joachim Büch
K Roomp
K Woods
LI Kuncheva
LM Mansky
M Rosen-Zvi
MA Hall
Mattia Prosperi
Maurizio Zazzi
Michal Rosen-Zvi
N Beerenwinkel
R Liu
RH Lathrop
Rolf Kaiser
S le Cassie
SE Sinisi
SM Hammer
T Fawcett
T Lengauer
Thomas Lengauer
VA Johnson
Y Huang
Yardena Peres
Publication venue: Public Library of Science
Publication date: 01/01/2008

BACKGROUND: Analysis of the viral genome for drug resistance mutations is state-of-the-art for guiding treatment selection for human immunodeficiency virus type 1 (HIV-1)-infected patients. These mutations alter the structure of viral target proteins and reduce or in the worst case completely inhibit the effect of antiretroviral compounds while maintaining the ability for effective replication. Modern anti-HIV-1 regimens comprise multiple drugs in order to prevent or at least delay the development of resistance mutations. However, commonly used HIV-1 genotype interpretation systems provide only classifications for single drugs. The EuResist initiative has collected data from about 18,500 patients to train three classifiers for predicting response to combination antiretroviral therapy, given the viral genotype and further information. In this work we compare different classifier fusion methods for combining the individual classifiers. PRINCIPAL FINDINGS: The individual classifiers yielded similar performance, and all the combination approaches considered performed equally well. The gain in performance due to combining methods did not reach statistical significance compared to the single best individual classifier on the complete training set. However, on smaller training set sizes (200 to 1,600 instances compared to 2,700) the combination significantly outperformed the individual classifiers (p<0.01; paired one-sided Wilcoxon test). Together with a consistent reduction of the standard deviation compared to the individual prediction engines this shows a more robust behavior of the combined system. Moreover, using the combined system we were able to identify a class of therapy courses that led to a consistent underestimation (about 0.05 AUC) of the system performance. Discovery of these therapy courses is a further hint for the robustness of the combined system. CONCLUSION: The combined EuResist prediction engine is freely available at http://engine.euresist.org
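The abstract compares classifier fusion methods but does not list them here. As a point of reference, one of the simplest fusion rules, averaging the predicted probabilities of the individual classifiers, is sketched next. The choice of the mean rule, the function name and the example probabilities are illustrative assumptions and should not be read as the EuResist engine's actual combiner.

```python
def fuse_by_mean(predictions_per_classifier):
    """Average per-instance probabilities across classifiers (mean-rule fusion).

    predictions_per_classifier: one list per classifier, each holding the
    predicted probability of therapy success for the same instances, in the
    same order.
    """
    if not predictions_per_classifier:
        raise ValueError("need at least one classifier")
    lengths = {len(p) for p in predictions_per_classifier}
    if len(lengths) != 1:
        raise ValueError("all classifiers must score the same instances")
    # zip(*...) regroups the scores so that each tuple holds one instance's scores.
    return [sum(scores) / len(scores) for scores in zip(*predictions_per_classifier)]


# Three hypothetical classifiers scoring two therapy courses (made-up numbers).
fused = fuse_by_mean([[0.2, 0.8], [0.4, 0.6], [0.6, 0.7]])
```

Averaging cannot exceed the best individual score on any single instance, but it reduces the variance between engines, which is in line with the more robust behaviour reported for the combined system.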

Public Library of Science (PLOS)

Crossref

Archivio della Ricerca - Università degli Studi di Siena

Directory of Open Access Journals

PubMed Central

UCL Discovery

The University of Manchester - Institutional Repository

MPG.PuRe

Exploiting residue-level and profile-level interface propensities for usage in binding sites prediction of proteins

Author: A Dubey
A Koike
A Rossi
AH Liu
AJ Bordner
AR Panchenko
AT Laurie
B Pils
B Thibert
B Wang
B Wilczynski
C Sander
C Yan
C Zhang
CC Chang
D La
DH Morgan
F Osterberg
G Cheng
H Chen
H Deng
H Neuvirth
H Yao
HX Zhou
I Res
I Xenarios
IM Nooren
J Meiler
JL Chung
JR Bradford
JW Torrance
K Henrick
KA Snyder
L Lo Conte
Lei Lin
MH Li
O Lichtarge
P Chakrabarti
Q Dong
Qiwen Dong
Qw Dong
QW Dong
S Jones
S Karlin
S Liang
SF Altschul
T Down
TJ Magliery
V Chelliah
VN Vapnik
W Kabsch
WS Valdar
Xiaolong Wang
Y Kim
Y Ofran
Yi Guan
Z Zhang
Publication venue: BioMed Central
Publication date: 01/05/2007

BACKGROUND: Recognition of binding sites in proteins is a direct computational approach to the characterization of proteins in terms of biological and biochemical function. Residue preferences have been widely used in many studies, but the results are often not satisfactory. Although different amino acid compositions among the interaction sites of different complexes have been observed, such differences have not been integrated into the prediction process. Furthermore, evolutionary information has not been exploited to achieve a more powerful propensity. RESULTS: In this study, the residue interface propensities of four kinds of complexes (homo-permanent complexes, homo-transient complexes, hetero-permanent complexes and hetero-transient complexes) are investigated. These propensities, combined with sequence profiles and accessible surface areas, are input to a support vector machine for the prediction of protein binding sites. Such propensities are further improved by taking evolutionary information into consideration, which results in a class of novel propensities at the profile level, i.e. the binary profile interface propensities. Experiments are performed on 1139 non-redundant protein chains. Although different residue interface propensities among different complexes are observed, the improvement of the classifier with residue interface propensities is negligible in comparison with that without propensities. The binary profile interface propensities can significantly improve the performance of binding site prediction, by about ten percent in terms of both precision and recall. CONCLUSION: Although there are minor differences among the four kinds of complexes, the residue interface propensities cannot provide efficient discrimination for the complicated interfaces of proteins. The binary profile interface propensities can significantly improve the performance of protein binding site prediction, which indicates that the propensities at the profile level are more accurate than those at the residue level.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Prediction of protein binding sites in protein structures using hidden Markov support vector machine

Author: A Henschel
A Koike
A Kouranov
A Porollo
A Rossi
AJ Bordner
B Wang
Bin Liu
Buzhou Tang
C Chothia
C Yan
C-T Chen
C-W Cheng
H Chen
H Kim
H Neuvirth
H-X Zhou
HX Zhou
I Ezkurdia
I Res
I Tsochantaridis
J Lafferty
J Song
J-L Chung
JD Fischer
JL Chung
JR Bradford
JW Torrance
K Henrick
L Holm
L Lo Conte
L Wang
Lei Lin
LR Rabiner
M Gribskov
M Vincent
M Šikić
MH Li
N Li
NJ Burgoyne
P Fariselli
Q Dong
Qiwen Dong
S Ahmad
S Liang
S Qin
SF Altschul
T Joachims
T Zhang
TH Dang
W Kabsch
WK Kim
X-w Chen
Xiaolong Wang
Xuan Wang
Y Altun
Y Liu
Y Ofran
Publication venue: BioMed Central
Publication date: 01/01/2009

BACKGROUND: Predicting the binding sites between two interacting proteins provides important clues to the function of a protein. Recent research on protein binding site prediction has been mainly based on widely known machine learning techniques, such as artificial neural networks, support vector machines and conditional random fields. However, the prediction performance is still too low to be used in practice. It is necessary to explore new algorithms, theories and features to further improve the performance. RESULTS: In this study, we introduce a novel machine learning model, the hidden Markov support vector machine, for protein binding site prediction. The model treats protein binding site prediction as a sequential labelling task based on the maximum margin criterion. Common features derived from protein sequences and structures, including the protein sequence profile and residue accessible surface area, are used to train the hidden Markov support vector machine. When tested on six data sets, the method based on the hidden Markov support vector machine shows better performance than some state-of-the-art methods, including artificial neural networks, support vector machines and conditional random fields. Furthermore, its running time is several orders of magnitude shorter than that of the compared methods. CONCLUSION: The improved prediction performance and computational efficiency of the method based on the hidden Markov support vector machine can be attributed to the following three factors. Firstly, the relation between labels of neighbouring residues is useful for protein binding site prediction. Secondly, the kernel trick is very advantageous in this field. Thirdly, the complexity of the training step for the hidden Markov support vector machine is linear in the number of training samples when the cutting-plane algorithm is used.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

ScholarBank@NUS